Do an initial poll of the `ChannelManager` write future immediately #3977

TheBlueMatt · 2025-07-31T01:41:49Z

In 7d856fe5b05d69b301fdb8a48352f3f3a16efca9 we moved to doing all of our persistence operations in the (async) background processor in parallel. This is great for slow persistence operations (eg with remote persistence). However, the ChannelManager persistence is very latency-sensitive (delays may lead to accidental force-closures today) and thus we don't really want to wait on network graph pruning or scorer time-stepping (which can be somewhat slow).

Instead, we keep the new bulk-polling of the persistence operations but immediately do a single step of the ChannelManager persistence future after it is created. This should give it a chance to get going - initializing whatever network sends or connections need to happen - before we get busy doing CPU-intensive stuff. While it may not actually make the persist happen in full, it at least gives it a chance, and should make as much progress as possible until a network resources is exhausted (eg send buffer is full), at which point we need to wait on the network which is almost certainly slow than the 1ms or so it will take to time-step the scorer.

In 7d856fe5b05d69b301fdb8a48352f3f3a16efca9 we moved to doing all of our persistence operations in the (async) background processor in parallel. This is great for slow persistence operations (eg with remote persistence). However, the `ChannelManager` persistence is very latency-sensitive (delays may lead to accidental force-closures today) and thus we don't really want to wait on network graph pruning or scorer time-stepping (which can be somewhat slow). Instead, we keep the new bulk-polling of the persistence operations but immediately do a single step of the `ChannelManager` persistence future after it is created. This should give it a chance to get going - initializing whatever network sends or connections need to happen - before we get busy doing CPU-intensive stuff. While it may not actually make the persist happen in full, it at least gives it a chance, and should make as much progress as possible until a network resources is exhausted (eg send buffer is full), at which point we need to wait on the network which is almost certainly slow than the 1ms or so it will take to time-step the scorer.

ldk-reviews-bot · 2025-07-31T01:41:51Z

👋 Thanks for assigning @joostjager as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

codecov · 2025-07-31T01:48:27Z

Codecov Report

❌ Patch coverage is 88.88889% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 88.89%. Comparing base (664511b) to head (91bce83).
⚠️ Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
lightning-background-processor/src/lib.rs	88.88%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3977      +/-   ##
==========================================
- Coverage   88.91%   88.89%   -0.03%     
==========================================
  Files         174      174              
  Lines      124232   124240       +8     
  Branches   124232   124240       +8     
==========================================
- Hits       110455   110437      -18     
- Misses      11301    11320      +19     
- Partials     2476     2483       +7

Flag	Coverage Δ
fuzzing	`22.61% <ø> (ø)`
tests	`88.71% <88.88%> (-0.03%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

joostjager · 2025-07-31T04:48:08Z

lightning-background-processor/src/lib.rs

+			// Because persisting the ChannelManager is important to avoid accidental
+			// force-closures, go ahead and poll the future once before we do slightly more
+			// CPU-intensive tasks in the form of NetworkGraph pruning or scorer time-stepping
+			// below. This will get it moving but won't block us for too long if the underlying


If this is needed, shouldn't we stay on the safe side and await cm persistence separately? (not include it in the joiner)

I don't think so - time-stepping the scorer takes something on the order of 1ms, so its going to be dwarfed by any network latency anyway. On the flip side, remote persistence may take something like 500ms if we have a bit high internet latency and there's a few RTTs. Thus, parallelizing the persistences across all the cases is really important for overall application latency, but at the same time we want to get things moving quick, so its worth polling once to get network stuff going.

To some extent this is really because we're trying to optimize for two drastically different cases - async persist with a local DB that is 0.5ms away and async persist with a remote VSS store that is 500ms away. But I do think this is basically a free way to optimize for both - there's no real cost to polling once, aside from a bit of extra code, and it lets us get ChannelManager persisted fast but doesn't stall the whole thing on net.

Ok, buying this.

ldk-reviews-bot · 2025-07-31T04:48:20Z

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

TheBlueMatt · 2025-07-31T19:06:32Z

Just gonna land this, its simple and straightforward.

TheBlueMatt requested a review from joostjager July 31, 2025 01:41

joostjager reviewed Jul 31, 2025

View reviewed changes

joostjager approved these changes Jul 31, 2025

View reviewed changes

TheBlueMatt merged commit 7e68f54 into lightningdevkit:main Jul 31, 2025
24 of 25 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Do an initial poll of the `ChannelManager` write future immediately #3977

Do an initial poll of the `ChannelManager` write future immediately #3977

Uh oh!

TheBlueMatt commented Jul 31, 2025

Uh oh!

ldk-reviews-bot commented Jul 31, 2025 •

edited

Loading

Uh oh!

codecov bot commented Jul 31, 2025 •

edited

Loading

Uh oh!

joostjager Jul 31, 2025 •

edited

Loading

Uh oh!

TheBlueMatt Jul 31, 2025

Uh oh!

TheBlueMatt Jul 31, 2025

Uh oh!

joostjager Jul 31, 2025

Uh oh!

ldk-reviews-bot commented Jul 31, 2025

Uh oh!

TheBlueMatt commented Jul 31, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Do an initial poll of the ChannelManager write future immediately #3977

Do an initial poll of the ChannelManager write future immediately #3977

Uh oh!

Conversation

TheBlueMatt commented Jul 31, 2025

Uh oh!

ldk-reviews-bot commented Jul 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented Jul 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

joostjager Jul 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

TheBlueMatt Jul 31, 2025

Choose a reason for hiding this comment

Uh oh!

TheBlueMatt Jul 31, 2025

Choose a reason for hiding this comment

Uh oh!

joostjager Jul 31, 2025

Choose a reason for hiding this comment

Uh oh!

ldk-reviews-bot commented Jul 31, 2025

Uh oh!

TheBlueMatt commented Jul 31, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Do an initial poll of the `ChannelManager` write future immediately #3977

Do an initial poll of the `ChannelManager` write future immediately #3977

ldk-reviews-bot commented Jul 31, 2025 •

edited

Loading

codecov bot commented Jul 31, 2025 •

edited

Loading

joostjager Jul 31, 2025 •

edited

Loading